
L1 & L2 Regularization Effects on Model Weights

The following histograms display the weights of the three layers in the CNN. We can see the effect of L1 and L2 regularization by observing how the weight distributions change over training.
Created on January 18|Last edited on March 19
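Per-layer weight histograms like the ones below can be collected at each training step. This is a minimal sketch of that bookkeeping with numpy, not the training code used in this report; the layer names `conv1`–`conv3` are hypothetical stand-ins for the three CNN layers.

```python
import numpy as np

def weight_histogram(weights, bins=64):
    # Bin a layer's flattened weights, the way a logging tool
    # (e.g. wandb.Histogram) summarizes them at each step.
    counts, edges = np.histogram(weights, bins=bins)
    return counts, edges

# Hypothetical stand-in for three conv layers' weight tensors.
rng = np.random.default_rng(42)
layers = {f"conv{i}": rng.normal(0.0, 0.5, size=500) for i in range(1, 4)}

hists = {name: weight_histogram(w) for name, w in layers.items()}
for name, (counts, edges) in hists.items():
    print(name, counts.sum(), edges.min(), edges.max())
```

Logging one such histogram per layer per step is what produces the time-evolving distributions shown in the panels below.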

Baseline

As expected, the baseline model with no regularization contains a wide range of weight values, roughly [-2.5, 1].

[Weight histograms over training steps (1-100) for the three layers, run: Baseline]


L1 Regularization

We can immediately see how L1 regularization eliminates features in the model. The histograms show a much narrower density distribution around 0 compared to the baseline, and the feature selection is especially visible in how sharp the peak of the weight distribution becomes as weights are driven to exactly zero. The range of the values remains roughly [-2, 1].
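The zeroing-out behavior described above is a known property of the L1 penalty. A minimal numpy sketch (not the report's training code) using the L1 proximal update, soft-thresholding, shows how it sends small weights to exactly zero, which is what sharpens the peak at 0:

```python
import numpy as np

def soft_threshold(w, lam):
    # Proximal step for the L1 penalty lam*|w|: every weight is pulled
    # toward 0 by lam, and any weight with |w| <= lam lands exactly on 0.
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, size=1000)   # stand-in for a layer's weights
w_l1 = soft_threshold(w, 0.5)

sparsity = float(np.mean(w_l1 == 0.0))
print(f"fraction of weights at exactly 0: {sparsity:.2f}")
```

A large fraction of the weights ends up at exactly zero, while the surviving weights keep their sign; this is the sparsity (feature selection) visible in the L1 histograms.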

[Weight histograms over training steps for the three layers, run: L1]


L2 Regularization

The heavy weight shrinkage of L2 regularization is clearly visible when comparing these histograms with the ones above. The range of the weights shrinks from roughly [-2, 1] down to [-1.5, 0.6].
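Unlike L1, the L2 penalty shrinks every weight multiplicatively (weight decay) without ever zeroing it out, which is why the range contracts but the distribution stays dense. A minimal numpy sketch of the decay term of the L2 gradient step (illustrative values for `lam` and `lr`, not the report's hyperparameters):

```python
import numpy as np

lam, lr = 0.01, 0.1                       # hypothetical penalty strength, learning rate
w = np.array([-2.0, -1.5, -0.5, 0.3, 1.0])  # stand-in weights

w_decayed = w.copy()
for _ in range(100):
    # Pure decay term of the L2 update: w <- w - lr * lam * w.
    w_decayed *= (1.0 - lr * lam)

print(w_decayed)   # every weight smaller in magnitude, none exactly zero
```

Each step scales all weights by the same factor, so after many steps the whole range contracts proportionally, matching the shrinkage seen in the L2 histograms.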

[Weight histograms over training steps for the three layers, run: L2]